ABSTRACT

Gumpgookies, an objective-projective test of school 
achievement motivation for children 3 1/2 to 8 years, was reduced from 
100 to 75 items following extensive factor analyses. This revised 
test attempted to dissipate the effects of response sets of the 
subjects and was prepared in three versions — an individual form, a 
group form for non-readers, and a group form for readers. However, 
the problem of response sets remained, and therefore a factor 
analytic procedure was devised to partial response sets out of an 
item intercorrelation matrix, resulting in a program that yielded 
orthogonal factors that are completely uncorrelated with the response 
set scores. (Author/MS) 
A NEW APPROACH TO RESPONSE SETS IN ANALYSIS OF A TEST OF MOTIVATION TO ACHIEVE*



Gumpgookies is an objective-projective test of motivation to achieve in 
school, intended for children in an age range of three and a half to eight 
or possibly nine. Each item consists of a description of two imaginary 
figures called gumpgookies, and the task of the child is to indicate with 
which gumpgookie he identifies. 

For example: 



The first form of the test consisted of 100 items in which illustrations 
of Gumpgookies were presented in left-hand or right-hand positions and in 
which the left-hand figure was always described first. 

Factor analyses of data from this first form yielded factors which, 
although suggestive of substantive interpretations, seemed to be influenced 
by the positions of the answers and/or by primacy versus recency, i.e., 
whether the keyed answer was presented first or last. Thus some factors were 
loaded for items with answers predominantly in the right-hand position, 
others for items with answers predominantly in the left-hand position. 
Likewise, some factors tended to be loaded for items with answers presented 
first, others for items with answers presented last. 

* The research reported herein was performed pursuant to a contract with 
the Office of Economic Opportunity, Executive Office of the President, 
Washington, D. C. 20506. The opinions expressed herein are those of the 
authors and should not be construed as representing the opinions or policy 
of any agency of the United States Government. 



Dorothy C. Adkins, University of Hawaii 
Bonnie L. Ballif, Fordham University 



These gumpgookies should be working. 



This one is watching. 



This one is working. 










In an effort to dissipate the effects of these response sets of the 
subjects, a new format was devised, whereby the alternatives were presented 
in varying positions: up and down, left and right, upper left and lower 
right, and upper right and lower left. At the same time, the order in which 
the alternatives were presented by the examiner was randomized. The number 
of items was reduced from 100 to 75, and most of the alternatives were 
revised to reduce their cognitive or verbal difficulty. This test was 
prepared in three versions: an individual form, a group form for 
non-readers, and a group form for readers. 

A previous report to the Office of Economic Opportunity, available through 
ERIC, described a number of factor analyses of data on these forms of the 
Gumpgookies test that had been completed by November, 1969 (Adkins and 
Ballif, 1970a). Although the results were interpreted in terms of 
substantive factors, the interpretations were still clouded by two main 
types of extraneous influences, or response sets: the effects of the 
positions of the answers to the items and the influence of the order in 
which the keyed answer is presented. A later publication presents the 
hypotheses on which the test is based and discusses their relation to 
empirically determined factors for the test in randomized format (Adkins and 
Ballif, 1970b). 

In retrospect, it appears that the effort to get rid of the effects of 
response sets by means of revising the format was not successful. 
Extraneous influences were still in evidence and had only become somewhat 
more difficult to detect. Parenthetically, it should be noted that these 
response sets have no systematic undesirable influence on total score on 
the test, because the subject is expected to get only a chance score on 
items to which he responds on the basis of a particular set. But the 
response sets do affect the items that are loaded on particular factors, so 
that a subject could get unwarrantedly high or low scores on the separate 
factors. Moreover, the effects of response sets on the composition of the 
factors made their interpretations very tenuous. 

Since, disappointingly, the change in format had not been successful, 
another type of solution to the response set problem had to be found before 
factors could be interpreted with any assurance. The next approach that 
was pursued was based on the idea of computing response set scores for each 
subject, partialling these out of the item intercorrelation matrix, and 
factoring the resulting matrix. Hence three scores were computed for each 
subject: the number of answers he chose that were in the left-hand position, 
the number of answers he chose that were in the up position, and the number 
of answers he chose that had been presented first. In the case of the items 
in which alternatives had been placed in a diagonal position, e.g., upper 
left and lower right, an arbitrary decision was made to regard upper left 
and upper right as up, lower left and lower right as down. This was done 
because the small numbers of items involved in the two diagonal positions 
would have resulted in response set scores of very low reliability for these 
positions. 
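
The three response set scores just described are simple counts over a 
child's answers. A minimal sketch in Python follows. The record layout 
(`Choice`, its field names, and the position labels) is hypothetical and 
chosen only for illustration; it is not taken from the authors' program. 
The sketch also assumes that answers in diagonal placements count toward 
the up score only and not toward the left-hand score, which is one reading 
of the decision stated above. 

```python
from dataclasses import dataclass

# Hypothetical position labels; diagonal placements are folded into up/down.
UP_POSITIONS = {"up", "upper_left", "upper_right"}


@dataclass(frozen=True)
class Choice:
    """One chosen answer: where it was printed and whether it was read first."""
    position: str          # e.g. "left", "right", "up", "down", "upper_left", ...
    presented_first: bool  # True if the examiner described this alternative first


def response_set_scores(choices):
    """Count a child's left-hand, up, and presented-first choices."""
    left = sum(1 for c in choices if c.position == "left")
    up = sum(1 for c in choices if c.position in UP_POSITIONS)
    first = sum(1 for c in choices if c.presented_first)
    return {"left": left, "up": up, "first": first}
```

Each child thus receives three scores, which serve as the variables to be 
partialled out of the item intercorrelations. 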

The problem of developing the mathematical solution for partialling 
these three variables out of an item intercorrelation matrix was presented 
to Dr. Paul Horst, whose technical report will appear later. The details of 
the computer program to effect the solution were worked out by Renato 
Espinosa and Robert Bloedon, members of the Hawaii Center staff, with the 
guidance of Dr. Horst. The result is a program that yields orthogonal 
factors that are completely uncorrelated with the response set scores. 
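
Dr. Horst's solution is not reproduced in this paper. For orientation only, 
the textbook way of removing a set of variables from a correlation matrix 
is shown below; whether the program follows exactly this form, and whether 
it rescales the residual matrix before factoring, are assumptions. The 
symbols are introduced here for illustration: R_II is the item 
intercorrelation matrix, R_IS holds the correlations of the items with the 
three response set scores, and R_SS holds the intercorrelations of the 
response set scores. 

```latex
% C is the residual (partial) covariance matrix of the items after the
% response set scores have been removed; rescaling C by its diagonal
% gives the matrix of partial correlations.
\[
  C = R_{II} - R_{IS}\, R_{SS}^{-1}\, R_{IS}^{\top},
  \qquad
  R_{II \cdot S} = D^{-1/2}\, C\, D^{-1/2},
  \qquad
  D = \operatorname{diag}(C).
\]
```

Because the residual item scores are uncorrelated with the response set 
scores by construction, any factors derived from the residual matrix are 
likewise uncorrelated with those scores. 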

The complete program includes routines that provide, among other things, 
the correlations of each item with the response set scores, the rotated 
"partial" factor loadings for each item, and reliability estimates (KR-20) 
for each partial factor as well as for the response set scores. It prints, 
for each item for each partial factor, approximate integral weights of 
-1, 0, 1 that could be used in hand scoring to yield approximate factor 
scores. A weight of -1 for an item indicates that it is functioning as a 
suppressor variable. The program also yields exact factor scores for each 
subject, based upon regression weights for each item. The approximate 
factor scores correlate in the neighborhood of .90 with the exact scores. 
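
The KR-20 estimates mentioned above follow the standard Kuder-Richardson 
formula 20 for dichotomous items. The sketch below is an illustration and 
not the authors' routine; the function name and the use of the population 
variance (dividing by the number of subjects) are assumptions. 

```python
def kr20(responses):
    """KR-20 reliability for 0/1 item scores.

    responses: list of per-subject lists, one 0/1 score per item.
    """
    n = len(responses)
    k = len(responses[0])
    if n < 2 or k < 2:
        raise ValueError("need at least two subjects and two items")

    # Variance of total scores (population form, an assumed convention).
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("total scores have zero variance")

    # Sum over items of p * q, where p is the proportion scoring 1.
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        sum_pq += p * (1 - p)

    return (k / (k - 1)) * (1 - sum_pq / var_total)
```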

It should be understood that, for any group of subjects sufficiently large 
to warrant a factor analysis, exact factor scores would be used if separate 
factor scores are desired. However, if a factor analysis resulting in 
approximate integral weights is available for a large sample and an 
investigator has only a small sample of the same kind of subject, the 
computation of approximate factor scores by means of the integral weights 
might be serviceable. These could be obtained by hand scoring. 
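
Hand scoring with integral weights amounts to a weighted count of item 
scores. The sketch below shows the arithmetic under the assumption that 
each item is scored 1 when the keyed answer is chosen and 0 otherwise; the 
function name and the sample values are hypothetical. 

```python
def approximate_factor_score(item_scores, weights):
    """Sum of item scores weighted by integral weights of -1, 0, or 1."""
    if len(item_scores) != len(weights):
        raise ValueError("one weight is required per item")
    if any(w not in (-1, 0, 1) for w in weights):
        raise ValueError("weights must be -1, 0, or 1")
    return sum(w * x for w, x in zip(weights, item_scores))


# Five illustrative items: three weighted +1, one ignored, one suppressor.
example_score = approximate_factor_score([1, 1, 0, 1, 1], [1, 1, 1, 0, -1])
```

An item weighted -1 lowers the factor score when the keyed answer is 
chosen, which is how a suppressor item enters the hand-scored total. 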

For the first factor analyses that had been run on the earliest form 
of the test, the number of factors was not specified, and it was naively 
supposed that the computations should be continued until the eigenvalue 
dropped below unity. It was found, however, that this had not occurred 
even when some forty factors had been extracted. It was clear that each of 
such a large number of factors would be determined primarily by so few items 
as to make their interpretation impossible. Moreover, because of the 
theoretical basis of the test, it was thought that probably no more than 
five or at most six factors would be interpretable. Hence the later analyses 
are largely confined to five factors, although some solutions with three, 
four, six, and eight factors were obtained. 

Separate analyses were made for 1813 four-year-old children, for 128 
first-graders, for 122 second-graders, for 250 first- and second-graders 
combined, and finally for a total group of 2313 children. Not surprisingly, 
the KR-20 values for the partial factors tended to be higher for the older 
children. It should also be mentioned that for all groups the KR-20 values 
for the partial factors tended to be lower than those for the factors based 
on the zero-order correlation matrix. This is doubtless true because the 
latter factors include reliable effects of response sets. Response set 
scores were more consistent for the older children. It was also interesting 
to find that factors for the older children showed relatively more influence 
of a primacy-recency set, those for the younger children more influence of 
answer position sets, as indicated by correlations of response set scores 
both with individual items and with "unpartial" factor scores. 

Details of the extensive work that was done in comparing the several 
solutions for different numbers of factors and for different groups, as well 
as in comparing partial factors and unpartial factors, will not be presented 
here. They would only overwhelm the reader, as indeed they often threatened 
to overwhelm the investigators. It had soon become apparent, with respect to 
both the original unpartial factors and the partial factors, that those for 
the four-year-olds did not correspond to those for the first- and second- 
graders as closely as had been expected. It was not unreasonable to suppose, 
however, that the factorial composition of motivation to achieve in school 
changes with age. Indeed, such is almost certainly the case. Yet, despite 
the conviction that changes with age in the factors affecting the test 
responses were to be expected, attempts to interpret the changes were not 
highly successful. 

Full exploration of this problem led to a question as to the 
dependability of factor loadings obtained from phi coefficients based upon 
relatively small numbers of cases. Although the general plan of the 
investigations that have been done was to have at least 200 cases for any 
factor analysis, it seemed possible that this number was too small. Hence a 
plan was devised whereby routinely each sample was divided at random into 
halves and separate factor analyses were made for each half as well as for 
the total sample. 
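
For reference, the phi coefficient between two dichotomous items and a 
random division of a sample into halves can be sketched as follows. The 
function names and the fixed seed are illustrative assumptions; the paper 
does not say how the random division was actually carried out. 

```python
import math
import random


def phi_coefficient(x, y):
    """Phi coefficient between two equally long lists of 0/1 item scores."""
    if len(x) != len(y):
        raise ValueError("item score lists must have equal length")
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when an item has no variance")
    return (a * d - b * c) / denominator


def random_halves(cases, seed=0):
    """Shuffle the cases reproducibly and cut the list at its midpoint."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]
```

With very uneven item splits the attainable range of phi shrinks, which is 
one reason loadings based on small samples may be unstable. 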

Then the general plan was to investigate the similarity of the three sets 
of factor loadings for each sample by inspecting the correlations of the 
loadings from the three solutions, i.e., for the two half samples and for the 
total sample. In this approach, a factor for the total sample was regarded 
as verified when a factor in one half sample and a factor in the other half 
sample each showed its highest correlation with the same factor in the total 
sample, and at the same time these two half-sample factors had the highest 
correlation of any pair of factors across the half samples. Thus 
factor 2 for the first half sample might correlate .65 with factor 3 for the 
total; factor 3 for the second half sample might correlate .77 with factor 3 
for the total; and factor 2 for the first half sample might correlate .73 
with factor 3 for the second half sample. If these were the highest of the 
correlations inspected for these factors, the factor for the total sample 
would be regarded as verified. 
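
The verification rule can be stated as a small procedure. The sketch below 
is one reading of the rule and not the authors' program: it assumes that 
"highest correlation of any pair of factors across the half samples" means 
highest within the row and column of the two half-sample factors concerned, 
and it compares signed rather than absolute correlations. All names are 
hypothetical, and factor indices are zero-based. 

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda idx: values[idx])


def verified_total_factors(h1_total, h2_total, h1_h2):
    """Map each verified total-sample factor to its (half 1, half 2) factors.

    h1_total[i][k]: correlation of loadings, half-1 factor i with total factor k
    h2_total[j][k]: correlation of loadings, half-2 factor j with total factor k
    h1_h2[i][j]:    correlation of loadings, half-1 factor i with half-2 factor j
    """
    verified = {}
    for i, row in enumerate(h1_total):
        k = argmax(row)  # total factor that half-1 factor i matches best
        for j, other_row in enumerate(h2_total):
            if argmax(other_row) != k:
                continue  # half-2 factor j points to a different total factor
            best_in_row = argmax(h1_h2[i]) == j
            best_in_column = argmax([r[j] for r in h1_h2]) == i
            if best_in_row and best_in_column:
                verified[k] = (i, j)
    return verified
```

In the illustration given in the text, half-1 factor 2 and half-2 factor 3 
both point to total factor 3 and are each other's best match, so total 
factor 3 would be returned as verified. 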

Results of a number of applications of this approach are presented in 
Tables 1, 2, 4, 5, 7, 8, 10, 11 and 14. The other tables show further 
comparisons among different factor analyses. 

Detailed inspection of these tables has led to the conclusion that the 
most defensible interpretation of factors results from the five-factor 
analysis based upon a group of 2313 cases, including 2063 four-year-olds and 
250 first- and second-graders. The five factors for this total group were 
verified more clearly than for any other subsample. Hence the interpretation 
of factors that can be offered now will be based upon this analysis for the 
total sample. At this stage, however, it will have become apparent that 
interpretations of factors gleaned from responses of young children to 
dichotomous items of the type in question are tenuous at best and must be 
based upon very large numbers of children. 

Although the KR-20 estimates of reliability for the total test score on 
Gumpgookies have been in the neighborhood of .85 to .90, the estimated 
reliabilities, as determined by KR-20 coefficients, for the five factors of 
course are not so high, ranging from .35 to .55 for the large combined 
sample. This is not surprising, since the total test consists of only 75 
items. An indicated next step, if any particular factor is to be explored 
more fully, will be to increase the number of items contributing strongly to 
that factor and have a single test for it. With an increase in number of 
items per factor, the factor score reliabilities may be expected to 
increase. 
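
The expected gain from lengthening a factor scale can be projected with the 
Spearman-Brown formula. The paper does not name this formula; it is offered 
here as a standard approximation, and it assumes that the added items 
behave like the existing ones. The function name is hypothetical. 

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when a scale is lengthened by length_factor.

    length_factor is the ratio of new to old scale length, e.g. 2 for a
    scale with twice as many comparable items.
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)
```

Under these assumptions, tripling the number of items on a factor with a 
reliability of .35 would project a reliability of about .62. 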

For the interpretation of factors, once they have been verified by the 
method described above, the procedure has been first to list for a factor 
the items that have their highest loading on it for the total sample. Then 
the loading of each such item on the corresponding factor in each half 
sample is recorded, with a notation as to whether it is the highest loading 
for the item. Greatest weight is accorded those items for which there is 
verification in all three analyses, i.e., for which the highest loadings 
apply to the appropriate verified factors. Attention is also given to the 
size of loadings, those of about .30 or above tending to be associated with 
greater verifiability than those below .30. 
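
The item-listing step can be sketched in the same spirit. The code below is 
an illustration only: the names are hypothetical, loadings are compared as 
signed values, and `match` is assumed to map each verified total-sample 
factor to its corresponding factors in the two half samples (zero-based). 

```python
def top_factor(loadings):
    """Index of the factor on which an item has its highest loading."""
    return max(range(len(loadings)), key=lambda idx: loadings[idx])


def item_verification(total, half1, half2, match):
    """List each item under its top total-sample factor with half-sample checks.

    total, half1, half2: per-item lists of loadings, one value per factor.
    match: {total_factor: (half1_factor, half2_factor)} for verified factors.
    """
    report = []
    for item, loadings in enumerate(total):
        k = top_factor(loadings)
        if k not in match:
            continue  # the item's top factor was not verified
        i, j = match[k]
        report.append({
            "item": item,
            "factor": k,
            "total_loading": loadings[k],
            "half1_loading": half1[item][i],
            "half2_loading": half2[item][j],
            # True only if the item also loads highest on the matching
            # factor in both half samples.
            "verified_in_all": top_factor(half1[item]) == i
            and top_factor(half2[item]) == j,
        })
    return report
```

Items flagged as verified in all three analyses, and with loadings of about 
.30 or above, would be the ones given greatest weight in naming a factor. 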

In the discussion that follows, the factor numbers in parentheses 
following the letter designations refer to the numbers for the total group 
analysis and the two half-group analyses in Table 14. 



Factor A (11, 1, 6) consists of items indicating an autonomous activity 
orientation permeating the use of time and interaction with others. This 
on-the-go behavior is more than generalized activity; it is initiating 
and engaging in specific behavior that is always appropriate to insure 
success in the particular tasks and situations at hand. It involves both 
knowing the effective instrumental steps and taking them. These activities 
are instrumental to achievement in general, e.g., wanting to work longer; 
to achievement in school, e.g., keeps trying to write numbers; as well as 
to obtaining reinforcement for achievement, e.g., shows its paintings to 
others. Perhaps this interpretation can also include ways of thinking, i.e., 
attitudes about school, as instrumental covert behavior for success in 
school. If so, the few items suggesting that school and learning are liked 
are still consistent within this framework. In any case, the factor consists 
of thinking of and doing those appropriate activities that are instrumental 
to achievement. It might appropriately be named instrumental activity. 

The reflection of a preference for school- and teacher-related 
experiences is clear in factor B (12, 2, 9). The specific items include 
wanting to go to school to learn and liking learning along with watching and 
helping the teacher as opposed to playing or engaging in other activities. 
This positive attitude toward school is further exhibited by an 
identification with the teacher, e.g., wanting to be the teacher when 
playing school. Factor B, then, appears to be a school enjoyment factor. 
Because items dealing with work-like activities in non-school settings were 
only sparsely represented in the total test, however, the possibility that 
the factor would be better described as work enjoyment has not been ruled 
out. 

The items constituting factor C (13, 5, 8) represent the ability to 
evaluate one's own performance coupled with the confidence that the 
evaluation will be high. The process of self-evaluation is suggested by 
items portraying gumpgookies who know when their work is right, when they 
are doing well in school, what they can and cannot do, and whether or not 
they are always doing their best. Items describing gumpgookies who are 
self-evaluated as always at their best and doing well also suggest an 
awareness of their own excellence. Perhaps this factor can be considered an 
evaluative factor. 

Factor D (14, 4, 10) consists almost entirely of items set in competitive 
physical situations, e.g., winning in running, climbing higher, and leading 
in follow the leader. Apparently it represents self-confidence in coming 
out on top, in being the best or better than the next one. With additional 
items staged in other settings, it seems likely that the factor would 
transcend physical activities. 

The common denominator for items loading on factor E (15, 3, 7) has to 
do with an awareness of implications of present behavior for the future, 
perhaps even more specifically for accomplishment of a future goal. The 
gumpgookies in these items are still trying to obtain their future goals, 
e.g., trying to write. They are apparently directed by their own 
self-initiated purposes. This, then, appears to be some type of purposive 
factor. The need for further verification of this interpretation with 
additional items or through experimental work is evident. 

Correlations Among Loadings on Five Factors Based on Zero-Order Correlations for the 
First Half, Second Half, and Total Sample of 250 Hawaiian Children in Grades 1 and 2 

Correlations Among Loadings on Four Factors Based on Zero-Order Correlations for the 
First Half, Second Half, and Total Sample of 250 Hawaiian Children in Grades 1 and 2 